On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts
Authors
Abstract
One of the intuitions underlying many graph-based methods for clustering and semi-supervised learning is that class or cluster boundaries pass through areas of low probability density. In this paper we provide some formal analysis of that notion for a probability distribution. We introduce a notion of weighted boundary volume, which measures the length of the class/cluster boundary weighted by the density of the underlying probability distribution. We show that the sizes of the cuts of certain commonly used data adjacency graphs converge to this continuous weighted volume of the boundary.
Keywords: Clustering, Semi-Supervised Learning
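As a rough illustration of the low density intuition in the abstract, the sketch below computes the cut size of a Gaussian-weighted adjacency graph for two different boundaries. It is not the paper's construction: the function name `gaussian_cut`, the kernel scaling `4 * t`, the toy two-blob data, and the absence of any normalization are all assumptions made for this example.

```python
import numpy as np


def gaussian_cut(points, labels, t):
    """Sum of Gaussian edge weights from points inside a cluster to points outside it.

    `labels` is a boolean array marking cluster membership. The kernel
    exp(-||x - y||^2 / (4 t)) is an assumed choice for this sketch.
    """
    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    weights = np.exp(-sq_dists / (4.0 * t))
    # Select ordered pairs (i, j) with i inside the cluster and j outside.
    across = labels[:, None] & ~labels[None, :]
    return float(weights[across].sum())


rng = np.random.default_rng(0)
# Toy data: two blobs centered at x = -2 and x = +2, so density is low near x = 0.
points = np.vstack([
    rng.normal(loc=(-2.0, 0.0), scale=0.5, size=(200, 2)),
    rng.normal(loc=(2.0, 0.0), scale=0.5, size=(200, 2)),
])
t = 0.05

# Boundary x = 0 passes through the low density gap between the blobs.
cut_low_density = gaussian_cut(points, points[:, 0] < 0.0, t)
# Boundary x = 2 passes through the middle of the right-hand blob.
cut_high_density = gaussian_cut(points, points[:, 0] < 2.0, t)

print("cut across low density region: ", cut_low_density)
print("cut across high density region:", cut_high_density)
```

On this toy sample the boundary through the gap yields a much smaller cut than the boundary through a blob, which is the qualitative behavior the abstract describes. The convergence result itself concerns suitably normalized cuts, which this sketch does not attempt to reproduce.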
Similar resources
Appendix to: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts
A Regularity conditions on p and S. We make the following assumptions about p: 1. p can be extended to a function $\tilde{p}$ that is $L$-Lipschitz and which is bounded above by $p_{\max}$. 2. For $0 < t < t_0$, $\min\left(p(x), \int K_t(x, y)\,p(y)\,dy\right) \ge p_{\min}$. Note that this is a property of both the boundary $\partial M$ and $p$. We note that since $\tilde{p}$ is $L$-Lipschitz over $\mathbb{R}^d$, so is $\int_M K_t(x, z)\,\tilde{p}(z)\,dz$. We assume that S has condition n...
Beyond Spectral Clustering - Tight Relaxations of Balanced Graph Cuts
Spectral clustering is based on the spectral relaxation of the normalized/ratio graph cut criterion. While the spectral relaxation is known to be loose, it has been shown recently that a non-linear eigenproblem yields a tight relaxation of the Cheeger cut. In this paper, we extend this result considerably by providing a characterization of all balanced graph cuts which allow for a tight relaxat...
The f-Adjusted Graph Laplacian: a Diagonal Modification with a Geometric Interpretation
Consider a neighborhood graph, for example a k-nearest neighbor graph, that is constructed on sample points drawn according to some density p. Our goal is to re-weight the graph’s edges such that all cuts and volumes behave as if the graph was built on a different sample drawn from an alternative density p̃. We introduce the f-adjusted graph and prove that it provides the correct cuts and volum...
A Feature Space View of Spectral Clustering
The transductive SVM is a semi-supervised learning algorithm that searches for a large margin hyperplane in feature space. By withholding the training labels and adding a constraint that favors balanced clusters, it can be turned into a clustering algorithm. The Normalized Cuts clustering algorithm of Shi and Malik, although originally presented as spectral relaxation of a graph cut problem, ca...
Detecting Overlapping Communities in Social Networks using Deep Learning
In network analysis, a community is typically thought of as a group of nodes with a high density of edges among themselves and a low density of edges to other parts of the network. Detecting a community structure is important in any network analysis task, especially for revealing patterns between specified nodes. There is a variety of approaches presented in the literature for overlapping...
Publication date: 2006